2023-12-18
If we need additional packages, we ask BPLIM’s staff to install them (tools folder)
A more flexible approach is to use containers, which give reproducibility and autonomy over packages and versions
“A container is a lightweight, stand-alone, executable package of software that includes everything needed to run a piece of software, including the code, runtime, system tools, libraries, and settings. Containers are isolated from each other and the host system. This isolation allows for efficient, reliable, and consistent deployment of applications, regardless of the environment.” (ChatGPT, 2023)
“Text document that serves as a blueprint for creating a Singularity container image. This file, typically having a .def extension, contains specific instructions and settings for the container. It outlines the base environment, including the base OS, any required applications, libraries, and dependencies.” (ChatGPT, 2023)
A detailed manual on how to build and use containers is available at BPLIM’s GitHub:
Definition files are available at BPLIM’s GitHub: https://github.com/BPLIM/Containers
“Git is a distributed version control system, primarily used for source code management in software development. It allows multiple developers to work on the same project simultaneously without interfering with each other’s changes. Git tracks the progress of changes in a series of snapshots, enabling users to revert back to previous versions of their work if necessary. It’s known for its speed, data integrity, and support for distributed, non-linear workflows.” (ChatGPT, 2023)
A detailed manual on how to set up and use Git on the remote server is available at BPLIM’s GitHub:
https://github.com/BPLIM/Manuals/tree/master/ExternalServer/Git
The BPLIM team developed a tool to streamline the replication of research projects.
Go to work_area and click on ReplicationApp.desktop
master.do file
json file
In the work_area folder, the file structure.json holds the different sets of information
Folder ados: ado files programmed by the researcher.
Folder code: contains the code used to replicate all the analysis performed by the researcher.
Folder results: outcomes of the statistical analysis. This is the folder that will be shared with the researcher after output control.
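The folder layout above can be sketched in Python, which ships in the container alongside Stata. This is a minimal illustration only: the folder names ados, code and results come from the list above, while the helper names create_project_layout and missing_folders and the use of a temporary directory in place of the real work_area are assumptions, not part of BPLIM's tool.

```python
from pathlib import Path
import tempfile

# Folder names taken from the list above; everything else is illustrative.
PROJECT_FOLDERS = ("ados", "code", "results")


def create_project_layout(root: Path) -> list[Path]:
    """Create the ados/code/results folders under root and return their paths."""
    created = []
    for name in PROJECT_FOLDERS:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


def missing_folders(root: Path) -> list[str]:
    """Return the names of expected folders that do not exist under root."""
    return [name for name in PROJECT_FOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    # A temporary directory stands in for the real work_area (assumption).
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "work_area"
        print(missing_folders(root))   # all three folders are missing at first
        create_project_layout(root)
        print(missing_folders(root))   # nothing is missing afterwards
```

A check like missing_folders could be run before a replication attempt to confirm that the project has the expected structure.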
Library for Public Images: https://cloud.sylabs.io/library/reisportela/bplim/bplim_stata17_python310
Go to Sylabs, https://cloud.sylabs.io/, sign up and sign in
Go to Remote Builder
Copy/paste the definition file into the text box
Give the container a name and click Submit Build
BPLIM_Stata17_Python310_from_Sylabs_V4.def
To build the container, you must have a valid Stata 17 license
When the container is built, the file Stata_ados_BASE.do is used to install the ado files you need
If you need additional Linux packages in your container, add them in the %post section of the definition file. See further details at https://github.com/BPLIM/Containers/tree/main/Stata
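As a rough illustration of adding packages to the %post section, the Python sketch below appends an install line to the %post section of a definition file held as text. Several things here are assumptions: the helper name add_post_packages is hypothetical, SAMPLE_DEF is a made-up minimal definition file and not BPLIM's actual one, and the apt-get command presumes a Debian/Ubuntu base image. In practice you may simply edit the .def file by hand.

```python
# Hypothetical minimal definition file; NOT the BPLIM definition file.
SAMPLE_DEF = """\
Bootstrap: docker
From: ubuntu:22.04

%post
    apt-get update

%runscript
    exec "$@"
"""


def add_post_packages(def_text: str, packages: list[str]) -> str:
    """Return def_text with an apt-get install line appended to the %post section."""
    # Assumes a Debian/Ubuntu base image, where apt-get is the package manager.
    install_line = "    apt-get install -y " + " ".join(packages)
    lines = def_text.splitlines()

    # Locate the %post section header.
    start = None
    for i, line in enumerate(lines):
        if line.split()[:1] == ["%post"]:
            start = i
            break
    if start is None:
        raise ValueError("definition file has no %post section")

    # The section ends at the next line that starts with '%', or at end of file.
    end = len(lines)
    for i in range(start + 1, len(lines)):
        if lines[i].startswith("%"):
            end = i
            break

    # Step back over trailing blank lines so the new line joins the section body.
    insert_at = end
    while insert_at > start + 1 and not lines[insert_at - 1].strip():
        insert_at -= 1

    lines.insert(insert_at, install_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Package names are placeholders chosen for illustration.
    print(add_post_packages(SAMPLE_DEF, ["git", "curl"]))
```

The edited text would then be saved as the .def file and built in the usual way.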
Support for parquet files, made available by Mauricio Caceres, can be used on the remote server
WORKSHOP on Automation of the Research Process